1 ATGCTGGGGG CAGTGGAAGG CCCCAGGTGG AAGCAGGCGG AGGACATTAG
51 AGACATCTAC GACTTCCGAG ATGTTCTGGG CACGATCAAG CACCCCAACA
101 TTGTAGCCCT GGATGACATC TATGAGAGTG GGGGCCACCT CTACCTCATC
151 ATGCAGCTGG TGTCGGGTGG GGAGCTCTTT GACCGTATTG TGGAAAAAGG
201 CTTCTACACG GAGCGGGACG CCAGCCGCCT CATCTTCCAG GTGCTGGATG
251 CTGTGAAATA CCTGCATGAC CTGGGCATTG TACACCGGGA TCTCAAGCCA
301 GAGAATCTGC TGTACTACAG CCTGGATGAA GACTCCAAAA TCATGATCTC
351 CGACTTTGGC CTCTCCAAGA TGGAGGACCC GGGCAGTGTG CTCTCCACCG
401 CCTGTGGAAC TCCGGGATAC GTGGCCCCTG AAGTCCTGGC CCAGAAGCCC
451 TACAGCAAGG CTGTGGATTG CTGGTCCATA GGTGTCATCG CCTACATCTT
501 GCTCTGCGGT TACCCTCCCT TCTATGACGA GAATGATGCC AAACTCTTTG
551 AACAGATTTT GAAGGCCGAG TACGAGTTTG ACTCTCCTTA CTGGGACGAC
601 ATCTCTGACT CTGCCAAAGA TTTCATCCGG CACTTGATGG AGAAGGACCC
651 AGAGAAAAGA TTCACCTGTG AGCAGGCCTT GCAGCACCCA TGGATTGCAG
701 GAGATACAGC TCTAGATAAG AATATCCACC AGTCGGTGAG TGAGCAGATC
751 AAGAAGAACT TTGCCAAGAG CAAGTGGAAG CAAGCCTTCA ATGCCACGGC
801 TGTGGTGCGG CACATGAGGA AACTGCAGCT GGGCACCAGC CAGGAGGGGC
851 AGGGGCAGAC GGCGAGCCAT GGGGAGCTGC TGACACCAGT GGCTGGGGGG
901 CCGGCAGCTG GCTGTTGCTG TCGAGACTGC TGCGTGGAGC CGGGCACAGA
951 ACTGTCCCCC ACACTGCCCC ACCAGCTCTA G (SEQ ID NO:1)






FEATURES: 

Start Codon: 1 
Stop Codon: 979 
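
The start and stop codon positions can be sanity-checked by translating the two ends of SEQ ID NO:1 with the standard genetic code. The sketch below is illustrative only: the function name `translate` and the two fragments copied from the listing (positions 1-30 and 967-981) are assumptions made for the example, not part of the original figure.

```python
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in reading frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Fragments copied from the cDNA listing (hypothetical variable names).
first_30 = "ATGCTGGGGGCAGTGGAAGGCCCCAGGTGG"  # positions 1-30
last_15 = "CCCCACCAGCTCTAG"                 # positions 967-981

n_term = translate(first_30)     # first ten residues
c_term = translate(last_15)      # last four residues plus the stop
coding_codons = (979 - 1) // 3   # codons before the stop codon at 979

print(n_term, c_term, coding_codons)
```

Under these assumptions the number of coding codons equals the length of the protein listed as SEQ ID NO:2.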

Homologous proteins: 

Top 10 BLAST Hits

                                                                          Score    E
CRA|18000004983962 /altid=gi|4502553 /def=ref|NP_003647.1| (NM_..         661    0.0
CRA|18000004936440 /altid=gi|3122310 /def=sp|Q63450|KCC1_RAT CA..         642    0.0
CRA|223000002652742 /altid=gi|15928726 /def=gb|AAH14825.1|AAH14..         641    0.0
CRA|18000005144641 /altid=gi|3114436 /def=pdb|1A06| Calmodulin..          556    e-157
CRA|18000004932361 /altid=gi|406113 /def=gb|AAA19670.1| (L24907..         548    e-155
CRA|117000066864297 /altid=gi|9966875 /def=ref|NP_065130.1| (NM..         470    e-131
CRA|149000126089143 /altid=gi|14422219 /def=emb|CAC41379.1| (AL..         398    e-109
CRA|224000007378166 /altid=gi|16755792 /def=gb|AAL28100.1|AF428..         398    e-109
CRA|114000110934306 /altid=gi|14196445 /def=ref|NP_065172.1| (N.          398    e-109
CRA|18000005191499 /altid=gi|4007153 /def=emb|CAA19296.1| (AL.02..        398    e-109
Blast hits to dbEST:

CRA Number         gi Number      Score              Expect
58000099505996     gi|12943070    1459 bits (736)    0.0
164000139365918    gi|12675371    1415 bits (714)    0.0
58000099322782     gi|12899184    1215 bits (613)    0.0
78000169264025     gi|14067900    1134 bits (572)    0.0
225000015220001    gi|18523306    1130 bits (570)    0.0
225000015219990    gi|18523305    1130 bits (570)    0.0
225000001633290    gi|15750044    1102 bits (556)    0.0
61000077034868     gi|14446412    1063 bits (536)    0.0
335000490524629    gi|8278341     995 bits (502)     0.0
146000055060127    gi|10205334    954 bits (481)     0.0
61000077035673     gi|14446457    890 bits (449)     0.0
1000488750278      gi|15128333    884 bits (446)     0.0
225000001678100    gi|15752845    882 bits (445)     0.0
222000012165952    gi|18781967    850 bits (429)     0.0
224000004550264    gi|15947485    718 bits (362)     0.0
162000005790240    gi|9185548     706 bits (356)     0.0
225000000831163    gi|15496148    317 bits (160)     6e-85



EXPRESSION INFORMATION FOR MODULATORY USE (library source) 

gi Number       Organ    Tissue Type
gi|10205334     eye      retinoblastoma
gi|15496148     brain    hypothalamus
1 MLGAVEGPRW KQAEDIRDIY DFRDVLGTIK HPNIVALDDI YESGGHLYLI
51 MQLVSGGELF DRIVEKGFYT ERDASRLIFQ VLDAVKYLHD LGIVHRDLKP
101 ENLLYYSLDE DSKIMISDFG LSKMEDPGSV LSTACGTPGY VAPEVLAQKP
151 YSKAVDCWSI GVIAYILLCG YPPFYDENDA KLFEQILKAE YEFDSPYWDD
201 ISDSAKDFIR HLMEKDPEKR FTCEQALQHP WIAGDTALDK NIHQSVSEQI
251 KKNFAKSKWK QAFNATAVVR HMRKLQLGTS QEGQGQTASH GELLTPVAGG
301 PAAGCCCRDC CVEPGTELSP TLPHQL (SEQ ID NO: 2)

FEATURES: 

Functional domains and key regions: 

Prosite results: 

PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site

264-267 NATA

PDOC00004 PS00004 CAMP_PHOSPHO_SITE
cAMP- and cGMP-dependent protein kinase phosphorylation site

219-222 KRFT

PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site
Number of matches: 3

1 28-30 TIK
2 70-72 TER
3 204-206 SAK

PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site
Number of matches: 9

1 55-58 SGGE
2 70-73 TERD
3 107-110 SLDE
4 122-125 SKME
5 204-207 SAKD
6 236-239 TALD
7 245-248 SVSE
8 279-282 TSQE
9 289-292 SHGE



PDOC00007 PS00007 TYR_PHOSPHO_SITE
Tyrosine kinase phosphorylation site

62-69 RIVEKGFY

PDOC00008 PS00008 MYRISTYL
N-myristoylation site
Number of matches: 3

1 128-133 GSVLST
2 283-288 GQGQTA
3 299-304 GGPAAG

PDOC00100 PS00108 PROTEIN_KINASE_ST

Serine/Threonine protein kinases active-site signature 

93-105 IVHRDLKPENLLY 
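
The short-motif hits listed above can be reproduced with a simple pattern scan. The sketch below is a minimal illustration, not the PROSITE tool itself: it assumes the protein of SEQ ID NO:2 with residues read as the translation of SEQ ID NO:1, writes four PROSITE consensus patterns as Python regular expressions, and uses a hypothetical helper `scan` that reports 1-based, inclusive coordinates.

```python
import re

# SEQ ID NO:2, taken here as the translation of SEQ ID NO:1 (assumption).
PROTEIN = (
    "MLGAVEGPRWKQAEDIRDIYDFRDVLGTIKHPNIVALDDIYESGGHLYLI"
    "MQLVSGGELFDRIVEKGFYTERDASRLIFQVLDAVKYLHDLGIVHRDLKP"
    "ENLLYYSLDEDSKIMISDFGLSKMEDPGSVLSTACGTPGYVAPEVLAQKP"
    "YSKAVDCWSIGVIAYILLCGYPPFYDENDAKLFEQILKAEYEFDSPYWDD"
    "ISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIHQSVSEQI"
    "KKNFAKSKWKQAFNATAVVRHMRKLQLGTSQEGQGQTASHGELLTPVAGG"
    "PAAGCCCRDCCVEPGTELSPTLPHQL"
)

# PROSITE consensus patterns rewritten as regular expressions.
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",  # N-{P}-[ST]-{P}
    "CAMP_PHOSPHO_SITE": r"[RK]{2}.[ST]",   # [RK](2)-x-[ST]
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",       # [ST]-x-[RK]
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",    # [ST]-x(2)-[DE]
}


def scan(pattern, seq):
    """Return (start, end, match) tuples, 1-based, allowing overlapping hits."""
    hits = []
    for m in re.finditer("(?=(" + pattern + "))", seq):
        start = m.start() + 1
        text = m.group(1)
        hits.append((start, start + len(text) - 1, text))
    return hits


results = {name: scan(p, PROTEIN) for name, p in PATTERNS.items()}
for name, hits in results.items():
    print(name, len(hits), hits)
```

The lookahead wrapper lets the scan report overlapping sites, which a plain `re.finditer` call would skip.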
Membrane spanning structure and domains:
Helix Begin End Score Certainty 

1 127 147 0.883 Putative 

2 155 175 1.515 Certain 

3 288 308 0.649 Putative 

BLAST Alignment to Top Hit: 

>CRA|18000004983962 /altid=gi|4502553 /def=ref|NP_003647.1|

(NM_003656) calcium/calmodulin-dependent protein kinase I
[Homo sapiens] /org=Homo sapiens /taxon=9606 /div=PRI
/dataset=nraa /length=370 
Length = 370 

Score = 661 bits (1688), Expect = 0.0 

Identities = 326/370 (88%), Positives = 326/370 (88%), Gaps = 44/370 (11%) 
Frame = +3 

Query: 126 MLGAVEGPRWKQAEDIRDIYDFRDVLGT 209 

MLGAVEGPRWKQAEDIRDIYDFRDVLGT 
Sbjct: 1 MLGAVEGPRWKQAEDIRDIYDFTOVLGT^ 60 

Query: 210 IKHFtJIVALDDIYESGGHLYLIMQLVSGGELroRIVEKGFYTERDASR 353 

IKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDASR 
Sbjct: 61 GSMENEIAVLHiaK^PNIVALDDIYESGGHLYLII^LVSGGELmRIVEKGFYTERDASR 120 

Query: 354 LIR^VLDAVKYLHDLGIVHRDLKPEN^ 533 

LIFQVLDAVKYLHDLGIVHRDLKPENLLWSLDEDSICEMISDFGLSKMEDPGSVLSTACG 
Sbjct: 121 LI^VLDAN^LHDLGIN^HRDLKPENLLYYSLDEDSKIMISDFGLSKMEDPGSVLSTACG 180 

Query: 534 TPGWAPEVU\QKPYSKAVDCWSIGVIAYILLC^ 713 

TPGWAPEVUVQKPYSKAN^X^IGVTAYILLCGYPPFYDENDAKLFEQILKAEYEroSP 
Sbjct: 181 TPGWAPEVLAQKPYSKAVDCV\SIGV1AYILLCGYPPFYDENDAKLFE^ 240 

Query: 714 YWDDISDSAKDFimiJvlEKD^ 893 

YWOHSDSAKDFIRHIJVEKDPEKRFTCEQALQH 
Sbjct: 241 YWDDISDSAKDFIRHIJVIEKDPEKRFTCEQALQHP^ 300 

Query: 894 SKWKQAFNATANA/T^RKLQLGTSQE 1073 

SKWKQAFNATAWRHMRKLQLGT^ 
Sbjct: 301 SKV^QAFNATAWRHMRKLQLGTSQEGQGQTASHGELLTPVAGGPMGCCCRTC 360 

Query: 1074 ELSPTLPHQL 1103 
ELSPTLPHQL 

Sbjct: 361 ELSPTLPHQL 370 (SEQ ID NO:4) 
Hmmer search results (Pfam):

Scores for sequence family classification (score includes all domains):

Model     Description                                    Score    E-value   N
PF00069   Eukaryotic protein kinase domain               271.6    1.1e-77   1
CE00022   CE00022 MAGUK_subfamily_d                      119.7    3.1e-35   1
CE00359   CE00359 bone_morphogenetic_protein_receptor      6.5    0.36      1
CE00031   CE00031 VEGFR                                    4.3    0.2       1
PF01496   V-type ATPase 116kDa subunit family              1.8    7.5       1
CE00292   CE00292 PTK_membrane_span                      -89.8    0.0011    1
CE00287   CE00287 PTK_Eph_orphan_receptor                -96.6    0.044     1
CE00291   CE00291 PTK_fgf_receptor                      -123.4    0.11      1
CE00286   CE00286 PTK_EGF_receptor                      -151.4    0.095     1
CE00290   CE00290 PTK_Trk_family                        -204.5    0.4       1
CE00016   CE00016 GSK_glycogen_synthase_kinase          -271.7    0.12      1

Parsed for domains:

Model     Domain   seq-f  seq-t      hmm-f  hmm-t        score    E-value
PF01496   1/1         85     95 ..       1     11 [.       1.8    7.5
CE00359   1/1         93    146 ..     272    327          6.5    0.36
CE00031   1/1         79    163 ..    1053   1137          4.3    0.2
CE00290   1/1          2    212 ..       1    282       -204.5    0.4
CE00292   1/1          3    227 ..       1    288        -89.8    0.0011
CE00291   1/1          1    230 [.       1    285       -123.4    0.11
CE00287   1/1          2    230 ..       1    260        -96.6    0.044
CE00286   1/1          1    230 [.       1    263       -151.4    0.095
CE00022   1/1         30    232 ..      75    283        119.7    3.1e-35
PF00069   1/1         25    232 ..      43    278 .]     271.6    1.1e-77
CE00016   1/1          1    302 [.       1    433 []    -271.7    0.12
1 AAACCGACCT TTGGCCTCTT GCCTGCCGTC CTAGTTGCAG GCTCTCTCCC
51 CTAACCTGGA CCCCAGCCAT CAAACTCTGG AGCCCCGCCA GTCACGTGAC
101 ACCTCGGTCC TTTTTGGCCT GTTTCCTTCA GGATCCCGAT TTAACTTCCT
151 CCTCCCCAAT TCCCTCTGCC CCCAATACCT CTAGGCACCA CCACCCGCTC
201 TGAGGAGCAA GTGTCTGGGG CTGAAGCCTC AGCTCCATCT TGCAGAGGAA
251 CCGGGGCCTC AGTCTTCCCA CCTGTCAAGT GGGGCCCACA CCCTGCGACC
301 ACCTCCACTC TCTTCATTGC CTAGTCTTGC CCGGTCCTTC CCCACTCCCT
351 CACTCCCCCA TCCCCCACCA GACTCCCGTG CAGTTCCAGG GCCTGTTTCC
401 CTTCAGGGCA CGGAGAAGGG AGACAGAGCC CTAAGGGAGG TCGCAGAACT
451 GGTCTGAAAG AAAATCCACC AGGCCACAGG GTGAGTTTGG CCGGCCTCTA
501 GCTTCAGACA GACGGGGTTC GAATCCTGCT TTGCTTCCGA CCACCCGCTG
551 ATTTGAAAAT CATCTCTCCG GGCCTCAGTT GTCCCCTCTG TGAAATGGAC
601 CCCGCTTAAG ACCAAGGGCG GGAAGCGTCC AGCAGGAGAT CTCTGACCAG
651 AAGCAGGGAG ATGGCCTCCA CCCGTGCCCC TTCCCCAGCC TTGGAGCGGT
701 GCCTCGCCTC CCAATCCCGG GTCCCTCCGC CGCAGGCTCC ACCTCCACTG
751 ACATCAGAGC CGCAGGCGGG CGGAGAGAGC CGCCGAGCCG AGCCGAGCCC
801 CAGCTCCAGC AAGAGCGCGG GCGGGTGGCC CAGGCACGCA GCGGTGAGGA
851 CCGCGGCCAC AGCTCGGCGC CAACCACCGC GGGCCTCCCA GCCAGCCCCG
901 CGGCGGGGCA GCCGCAGGTA CAGCCGGGCC CCCCATCCCT GCACCCCTGG
951 GCGCTGCGTG GGGGCGGTGG GAGCCCCTAG CCTCTGGGTA TCCTTTCCCA
1001 AGGAGTGGCC ACTGGGCACT CTCCCGGGCG GGCTGGACCC TGAGGGGCAG
1051 GGCTGGGCCT TTCTCCACCT CTGTCCCAGG CCCAGCAGGT GCCAGGCGGG
1101 CCTATGGGAC ACTGAGTGGG TAATAGAGAA GGGGGCCTGT GTGAGCGCCT
1151 TCAGCTGGGC CTGACTGGAA GGGCGTGGGC ATTTGGAGGT ATCCATGGGG
1201 TGGGGGGGCT TGCGGAGTGT ACTGTCTTAG GACAGGCGTG TGGGTCAGAC
1251 ATGGGTGGAG GATCTGGGAA TCTGTGTGTT TTTTGTTCCA GAGGGGTGTC
1301 CACGTGTTTT GTGTGCTGGT ATTTGGCTCT CAGGGTCTTA AGTCAGAGTT
1351 AGGAGGGGGT GTACAATTGT GTACTGAGGA TGTTTGGAGT TAGGTGTGTA
1401 AGGACTTGGG GTTTGGTTTG GAATACAGGA GCTTCCAGGG GATGGGGTAG
1451 AGGAGCTGGA GGGTGTAGGG TACGTCTGGT ATATGAGGGT GTGTGTGTGT
1501 GTGTCTGGGT GTCATCTTGT GTGGGTGCGG GTGGATGTGT GTTTTGGGGT
1551 GTAAGAGGGA GCTGGGTGAG GGATGTTTGG ATGGACAGGC AGGTGTTCGG
1601 GTGCAGGGCT GTCTGGGGCA CTGTGTGGTG TGGACATGTG TGCTGATGTC
1651 TGGGAGTACA TGTATGATCA GGTGTCACGG GATGTGGATA CAAGGCGTAC
1701 TGGATCTGGG AGGCAGGTGT TTGAGTTCAG GGCTGTGGAG GGGGCTTGGT
1751 GTGGCATGTC TGCTACAGGG ATGTGTGTGG ATCTGTGAGG GTTGTATTTG
1801 GTAGGCCTCC ATGTGGGTTT CAGACTCTGC CTCTAGAGCT TACACTCGAG
1851 TCTCCTTTCC TAGAAGATTC TGCCCCTGGA TGGGTGGGCA GGGTCCCCTG
1901 GGAAAAAGGT CCTGTTCCAG GAGTGGAATC TCACACCAGA GGCCCTAGTC
1951 AGGGCACCTT CTCCTCATTC TCCCTTAGAG AAAAAGAGAG AAGGAAAGTG
2001 CTCTCCCTGA GGTCACAAAG CATGCTGGGC TCTGTTTTGG CCTCATCTGT
2051 GGATGGGTTG GGAGGCTGTG TTCTCTGAAT GGGGCCCATT CTGGCTTCAT
2101 ATTGGAAGTA CCAGCCAAGG CCATTCGATG GCCTTTGCCC TCAGCAAGCT
2151 TAGCTGGGGG CCCCAGGCCA GGTGTCATTA GGGCCTCTGG AGCCAGCCTC
2201 TACCTAACTC CAACCTCAGT CTCCCCATTC TTCATCTGAT AAATGGGAGA
2251 GAACTCCCAC CCTCTCCTGC TGGATGAGAC AGACCTCAGC AGAGGAAGGG
2301 CCAGGCTGGA TAGGGTTAGA TGGGGCCAGG AAGGGACAGA GTGAGCAGGA
2351 CCATTTCTCA TGCTCCCGGG ACCCAGATGG GGAGTCAGGA GGGAGAGGTC
2401 TGGGGAGCTC CAGCTGTGGC TGTTGTTGCT GTGGTAACAG TGCAGAAAGA
2451 GCTATTTAAA AATGTGGCTG AGATGTTGCT GGAAGCCCAG GCTGCTGGAA
2501 ACCTGATTTC GGAGAGGCCG GGGAGTCGGG GGAAGGAGGA GGGAAAGGAG
2551 ACACCCCAGC AATCCCCAGG GTGGGGCGGG GACATCACTG GTTCTGGGGA
2601 CAGGGGGATC CTCCAGGCTT CTACCAGCTG CTCTGGGGGT TTATCTGTTG
2651 TACTGCCAGA AGTCAGGGTT TCCCTAGGTG CTTGGATTTG GATAGGGGGA
2701 AAACTGGGAA GAGAACTAGA ATAAATGAAT GAATGAATGC ATGACTTTGT
2751 TAAATAAAGA ATTTTGCTGC CACTGTGAAA GGTTTTTCTC TAGGCATGAG 
2801 AATTTGCTGA ATGTTGAATA AACAAATGAA TGTTTGTTGA ATGATTTTGT 
2851 CAAATGGATG AATCAAGGAT GAATAAATGC AGGTTGAATG ACTGAATGGG 
2901 GCCTGCAGTA AATTCCCAGA CAGAGGGCTG GGCTCTGCTG AGTCTCCTCC 
2951 TTCCATTCTC CTTACAGGAG CCCTGGCTGT GGTCGGGGGG CAGTGGGCCA 
3001 TGCTGGGGGC AGTGGAAGGC CCCAGGTGGA AGCAGGCGGA GGACATTAGA 
3051 GACATCTACG ACTTCCGAGA TGTTCTGGGC ACGTGAGTCC AGGGCAGGAT 
3101 TGGGTGCTGG ATGGCTGAGG GAGGCTGAGT CCAGGGTGGG GCTTCCTCTG 
3151 GTCAATTAAT GCTTCCTGTT TCCCACAGCC CAGGCCCTGT GGCAGCACTA 
3201 TCTAGGGCCT AAACTGTCCC CAGCTTTTCA CTTCTGGATG ACAGTGGGTG 
3251 GGACACGGGC TGCTCTCCCA ATAGCCCTGG GTTCTTGAAG AGAAAGAAGT 
3301 CGAGAGAATG AAGGTGCCAG TCAGTCCATT TAACTTGCTG CCAAGAGCTA 
3351 AGTGTTCTAG CCTAGGTTTG GGAACTGAGG CTGGAGATGG CTCTGTTCTT 
3401 GGTGCTGGGA ATGCAGAAAT AACTCAAACC TGGTCTCTGC CCTTCAAGTT 
3451 GATCCCAGAC ATGTGCAAGA GACAGACCTA CAGAAAATGA CAACAGGGTG 
3501 TGTGCTGTGC TCCAATTAAG GTTGGGATTG AGGGCTTTGT GGAGCCCAGA 
3551 GAGAGCTGTG CCTTCTGCCT GGGGGAAAAC TTCCTGGAGA ATGGGGCATT 
3601 AGAGCTGGGG ACTGAAGGAT GGGTAGGTGT GCACTTGTCA GAGAGGAAGA 
3651 AGGACATTCC AGGCAGAAGG AATAGCATAA ACAAAGGCTT AGAGGCATGG 
3701 TTCTATGTGG AGAGAGGTAG AGTGTGATGG AGCTTAAAAT CACAGGCTGG 
3751 GGGGAGAGTG GAAAAAGGGG CTGGAGATGA AAGTGGGACA GTTTGTGTAG 
3801 GGTTTTGGAA GCCAGGCCAG GGAGGCTGGA TATTGTCCCA TAGGCCACCG 
3851 GGAGACACTT AAGACTTTTT GGCAGGTGTG CAATTCAGGA TAGTCACTCT
3901 GGCCACAGCT TGGAGGGTAA ATTGGAGAGG GACAAGACTG GAAACCAGTG 
3951 ATGAGGTTAC TACAGTAACT AATTATCCCT GAGGATTGAA ATTTCACCAC 
4001 GAGAGATGCT TTTCTTTGAC TTATGACTTC TTATTCTCCC AGAGAAAGCA 
4051 AACAGATGTG GAAAGAATAC CCTAGCAAAT CCTCTTTAAT CAGTTAACTT 
4101 TAGTTAAATG AGTTTATTTG TTCCTTTTTA AGAACCTGTT CTAAAACACT 
4151 GCTTCTTAAA GTTCAATGAG CATACAAATC ACCTGAGGAT TTTGTTAAAC
4201 TGCAGATTGA TTTAGTAAAT CTGGGGCAGG GCCTAAAGTT TTGCATTTCT
4251 TTTTTTTTTC TTTTTTTTGA CCCAGGATCC AAAGCAGTAG AGATTTTGCA
4301 TTTCTAAAAA AGTTCCCGGG TGATGCTGAT GGTTCTTTAA GGTTCTAAAG 
4351 GGTGTTAAAT TAGCCATGAC TCGAATTAGC AGAAAAAGGG ATGAACCAAC 
4401 TGTACACATA ATCCAAAAGC CCAGGGGTAG ACCTCAGGCA TGGCTGGATC 
4451 CAGAGGGCCA CATAATGTTA TCAGGAAATA TATTTGGCCA TTTCTCAGGT 
4501 TGGACTTCCT TTGTGTTAAT TTCATTCCCA AGCAGGCTCT CCCCAGGTGG 
4551 TGGCAAAGAT GATCGCCATT AGCTCCAGGC TTACATCCTA CCAGCTCAAC 
4601 AGGAGACTCA TTCTCAAAGT GCTAGTAAGC TGGCTTGCAT CACATGACCA 
4651 ATTACTGTGG CCAGGGGAGA GACTACTTTG ACTGGCCAGG CCTGGGTCAT 
4701 GTGACCATCT CTGGAGCCAG GGGATGGATG AGTGACTAGG GGAGGGTCAT 
4751 CCACGTCCTT GGTCCAGCAG TGGTCACAGA ACCCATAGGG AATGGAGGAG 
4801 AGGCTGGAGG GAAGCTGGGG TTCCAGTTCT TCACCTTGTG AATCCCCTCT 
4851 CCCGATAGGG GGGCCTTCTC GGAGGTGATC CTGGCAGAAG ATAAGAGGAC 
4901 GCAGAAGCTG GTGGCCATCA AATGCATTGC CAAGGAGGCC CTGGAGGGCA 
4951 AGGAAGGCAG CATGGAGAAT GAGATTGCTG TCCTGCACAA GTGCGTGGGC 
5001 CACAGCCTTT CCCTGCCCCA AGCTGACCCT GCCTTGGCCC TCCCATCCTC 
5051 CTCCTTTCCT GCTTTGGACA AATCATTTAA ACTCTCTAAG CCTTAAATTG 
5101 CCCCTTTATA AAATGGGGAT CACAATTTCC ACTTGGCAGG GTTGTGGGGA 
5151 ACATCAGAAG TCCTTTATTT CAAGTGCCTG GCCTAACATG ACAGATGTGA 
5201 TGGAGGTGCC AGTGCTTAGT CACAGGGGTT TAACTGTTCA ATCAGGTGTA 
5251 AAGATCCATC CTGAACATGG CTTGGACCCA CATATCTCAG TTGGTGTTGT 
5301 CTCTGGACCT ACCTCAAGTT CCCCTCACAT ATTAAAACCA CTCAGCAAGT 
5351 TTAAAAATGA CTGTCTGCTG ACCCCCAGAC TAAATCCACA ACCAACTGGT 
5401 CTATGAATTG CTCATGCTGA TATGAAACCT CCTGTCCTCA CTGGAAAACT
5451 TACAGAGAAT CACTTCCAAT CTCTCCCCTG AGCTTCCAAC CACCCTGGGC
5501 AGATAATTTT TTTTTTTTTT TTGAGATGGA GTCTCACTCT GTTGCCCCGG
5551 CTGGAGTGCA GTGACGCAAT CTTGGCTCAC TGCAACCTCT GCCTCTTGGG
5601 TTCAAGCAAT TCTCTTGCTT CAGCCTCCCT AGTAGCTGGG ATTACAGGCA
5651 CCTGCCACCA CGCCCGGCTA ATTTTTGTAT TTTTAGTAGA GATGGGGTTT
5701 CGCCATGTTG GCCAGGCTGG TCTCGAACTC CTGACCTCAG GTGATCCACC 
5751 CGCCTCGGCC TCCCAAAGTG CTAGGCATGA GCCACCACAC CCAACTCCTG 
5801 GCAGAGCATT TCTAATAAGA CCCAGAGAGG ACAGGGATTT GTATACAGTC 
5851 ACATGGCAAG TTTGTGGCAG AGCTGAGCCT TCCTCATCAT CAAGATCAAT 
5901 TATCGCCTGA CCAACACGGA GAAACCCTGT CTCTACTAAA AATACAAAAT 
5951 TAGCCAGGCG TGGTGGCACA TGCCTGTAAT CCCAGCTACT TGGGAGGCTG 
6001 AGGCAGGAGA ATTGCTTAAA CCCGAGAGGT GGAGGTTGCG GTGAGCCGAG 
6051 ATCACACCGT GCATTACACT CCAGCCTAGG CAACAAGAGC AAAACTCCAT 
6101 CTCAAAAAAA AAAAAAAAAC AAAAAAAAAA CAAAAACGCC AGGCGCAGTG 
6151 GCTCACGCCT GTAATCCCAG CACTTTGAGA GGCTGAAGTG GGCAGATCAC 
6201 CTGAGGTGGG GAGTTCCAAA CCAGCCTGAC CAACATGGAG AAACTCCGTC 
6251 TCTACTAAAA ATACAAAATT AGCTGGACAT GGTGGCGCAT GCCTGTAATC 
6301 CCAGCTACTT GAGAGGCTGA GAAAGAAGAA TCACTTGAAC CCAGGAGGCA 
6351 GAAATTGTGA TGAGCCAAGA TCATGCCATT GCACTCCAGC CTGGGCAACA 
6401 CTCCAGCCTG AGCAACAAGA GTAAAACTCC GTCTCAAAAA AAGAAAAAAA 
6451 AAATCAATTA CCATTATTGT TTCACTTATG AGTATTTACC GTGTGCCAGG 
6501 CACTGTGCCA AGCACCTTAC CTGCATTATC TCACATGATC CTCACTCCAA 
6551 CTCTTTGAGG GAAGTACTAC CATTGGCTTC ATTTTATAGA TGAAGAAACT 
6601 GAGGTTCAGA GAGGTTACAT TAAATCTAGC ACCTACCCTG TACCAGGTGC 
6651 TGGAGGAACA GTGGCAAGCA AGACAAAGCC TCTGGATTCG GGGAGCTTAT 
6701 GTCTGGTGGG GGAGGCTGAC AAACATGTAA ACACAGAAAA CTATATATAT 
6751 ATATTTTTTT TGAGATGGAG TTTTGCTCTT GTTGCCCAGG CTGGAGTGTA
6801 ATGGCATGAT CTCGACTCAC TGCAACCTCC GTTTCCCAGG TTTAAGCAAT 
6851 TCTCCTGCCT CAGCCTCACA GATAGCTGGG ATTACAGGCA TGTGCCACCA 
6901 TGCCTGGCTA ATTTTTGTAT TTTTAGTAGA GATGGGTTTT CGCCATGTTG
6951 GCCAGGCTGG TCTCGAACTC CTGACCTCAA GTGATCCGCC TGCCTTGGCC 
7001 TCCCAAAGTG CTGGGATTAC AGGTGTGAGT CTCTGTGCCT AGCCAGAAAA 
7051 CTCTTAAGAG GTATGTATCA GGCTGGGTGC AGTGGCTCAC TGGTGAAAAG 
7101 ATCTGCACCC AAATAGCATG TGACGGGCAG GATTTGGACC CAGGTCTGTG 
7151 TATGCCAGAG CCCAGTGTTT ATCCCTCTGC TCTCTCACCT TCCAAAAAAT 
7201 GGTAATAAAC CATGGTAAGC TAGCTTTTCC CTTTGGGGAC GAGATCCTTG 
7251 GTTTGTCCTA CCCAGGTATG TAGGCAGTGG TCGGGGGTTG GGGGTGGCTG 
7301 AGCTGTCCTG AGCTCTAAAC CGCTGTTTTT TTTTTTTTTT TTTTGAGACA
7351 GGGTCTTACT CTGTTGCCCA GGCTGGAGTG CAGTGGCTAG TCACAGGTGC 
7401 AATCATAACA GACTGCAGCT TTGAACTGCT GGGGCCAAGT GATCCTCCTG 
7451 CCTCAGCCTC CCAAGTTCCC AAGTAGCTTG GACTACAGGT GCACACCGCC 
7501 ATGCCTGGCT AAACCACCTC ATTTCTCCTT TCAGGATCAA GCACCCCAAC 
7551 ATTGTAGCCC TGGATGACAT CTATGAGAGT GGGGGCCACC TCTACCTCAT 
7601 CATGCAGCTG TGAGTGGCCC AACCTCTGCC CTGCCCCCAC ACCTCTCCCA 
7651 GCTGTCCCAA CCCTCTTTGC CAGACTGCCC TATCCCCTGC TGCAGGGTGT 
7701 CGGGTGGGGA GCTCTTTGAC CGTATTGTGG AAAAAGGCTT CTACACGGAG 
7751 CGGGACGCCA GCCGCCTCAT CTTCCAGGTG CTGGATGCTG TGAAATACCT 
7801 GCATGACCTG GGCATTGTAC ACCGGGATCT CAAGGTGGGG CTCAAGGGGG 
7851 TGTGGTGAGC TAGGGTACCC AGGGGTGGGG CCTTTGCAAA CCCCAAACTG 
7901 TCTGACCTTG GGCAACTTTC ACCCCCTCAC TGAGCCTTGG ATTTCCATCT 
7951 ACAAAGTGGA TCTTGTAACC TTTAAACTGC CTCCTCCCAT TCTAGTCCAG 
8001 ATACTCAAAG GAACACGAGT GAATTGTGTG GCATTTTATC CAAACAACAT 
8051 TTTGTCTTTT TCTGATTAAA AAAAAAAAAA TCTGGCCAGA CAGGATGGCT 
8101 CACGCCTGTA ATCCCAGCAC TTTAGGAGGC AGAGACGGGT GGATCACCTG
8151 AGGTCAGTTC GAGACCAGCT TGGCAAAACC CTGTCTCTAC CAAAAATACA 
8201 AAAATTAGCC CGGCGTGGTG GCAGATGCCT GTAATCCCAG CTACTAGGGA 
8251 GGCTGAGGCA GGCGAATCAC TTGGACCCGG GAGGCAGAGG TTGCAGCAAG 
8301 CTGAGATTGT GCCATTGCAC GCCAGCCTGG GCGACAGAGC GAGCCTGGAC 
8351 GACAGAGCGA GACTCCATGT CAAAAAAAAT AAAATAAAAA CAAAAAATCC 
8401 TATTCCCCTT CTGTAGAAAA CTTGGATGGG ACAGCAAAAC ATAAAGAAAA 
8451 AAGCCAGAAA TCCCCGAAAT CCTACTCCTC GGAAATAGCG ACGGGGCTCA 
8501 CATTTAGCAG TACATCTCAA TCCGTTCTAG GAGAAGGGCA CTTGGGGTGT 
8551 GACATGCCTG GTTTTGAATT CTGGCTCTGC TACTGCCTAA CTGTGGGTTC 
8601 TTGGGTGAGT CACTTTGCCT CCAAAGGCAT CAGTTTCCTC ATCTGTTAGG 
8651 TGAGATTATA CAGACTGGCC TAGCAGGGAA GCAGTGAGGA TGGCATTAAA 
8701 TCAAGCACTA ATCCAGGGTC TGGCATAAAA TAGGCATTCA AACATTCCTT 
8751 TAGGGCTTTA CAGTGCACAC CTGAGGTTTA GAGACAGTTC CCCCCCACAC 
8801 CCTCTTGAGC CTTGTCCTTC CTGGAATTTT TGGCCTTCTT GAGAGCTTCC 
8851 TTGATTTTCT TATGACAGCC ATGAAGCCAC AGTGGCTTTT GGGGATCCAT 
8901 TATTTCTCAG AAGGTGCTTG GAGCGGCAGA AGGTTCTACC AGCCTCTAAC 
8951 CATCTCTGAT TGCCCCTTCT CTTCCCTCCT GCCCTTCAAG CCAGAGAATC 
9001 TGCTGTACTA CAGCCTGGAT GAAGACTCCA AAATCATGAT CTCCGACTTT 
9051 GGCCTCTCCA AGATGGAGGA CCCGGGCAGT GTGCTCTCCA CCGCCTGTGG 
9101 AACTCCGGGA TACGTGGGTG CGGAGGGCCC TGGGCTGGGG CTGTGATGGT 
9151 GGGGGGAACC AGGAGTTGAA GGGCAGAGAT TTGTCACCAC CACGTCCTCT 
9201 TCCCTCCACA GCCCCTGAAG TCCTGGCCCA GAAGCCCTAC AGCAAGGCTG 
9251 TGGATTGCTG GTCCATAGGT GTCATCGCCT ACATCTTGTA AGTGGGGCTT 
9301 GGCCATGGTA GGCTGTGGCT CCAGAGTTGT CCTCTCGCCT ACTTTCCTCT 
9351 CTTCCTTCCT CTGCTCTCCC TCTGCCCTCC CTTCCTTCCC TCCCTCCCTT 
9401 CCTTCCACCA ATCAATTACC AGTATTACTT CATTCAATAG ATACTATGTT 
9451 TCAAGCACTG TGCCAAGCAA GCACTGGGGT AAATTTAGCA CAGCACAAAC 
9501 CAGACAAAGT GCCTGCCCTC AGGGAGCTGA CTTTCTTTCT AGTAGGGAAG 
9551 ACAGACAATC AACAAGTAAA TAAATCTACA AACTGACGTC AGGTGATAAA
9601 AATAAATACT GTGGAGAAAA ACCAAGCAGG AATAGGGAGA CGGGGTGATG
9651 CCATTTCAGT AGGGAGGTCA GGGAAGGGCT CGCTGTGGAG GTGATGACCG
9701 AGTGGTGAGG GAGCCAGACA TTGGAGGTGT GGGGAAAGAG TGGCATAGGC
9751 AGAAGCAATG GCAAGTGCAA AGGCCCTGAG GAGGGCAAGA TGGCGGCACA
9801 TACAAGGAAC AGAAAGGATA ATGTAGCTAG AACAGGAGTG AGCAGGCAGG
9851 GCTGGTAGAG TTTATAAAGG GGGAACTCCT TCCATGGCTC CTGCCTGACC
9901 CCTGAGACTG CCCCAGTGCT CCACCCCGGA GCCAACGGCA CCCGAAAGTG
9951 GAAATGAGGA TGAGTTTCTC CCTGCCCAGG CTCTGCGGTT ACCCTCCCTT 
10001 CTATGACGAG AATGATGCCA AACTCTTTGA ACAGATTTTG AAGGCCGAGT 
10051 ACGAGTTTGA CTCTCCTTAC TGGGACGACA TCTCTGACTC TGGTATTTGG 
10101 GGCTTTGCTT TTTTCCCCTG GGCCCTGCCT CTGGTTCCTC CCTCACCTGC 
10151 TTTGGGGGCG GTCTCCCTCC TGCCTTCCTT CTGTCGGATT TTCCAGCACC 
10201 ACACAAAGAG CTGTCTTCGA GACCAGACAC CCTACCCCTT CTTCCTTCTG 
10251 CTTGGGTACT TCCTTCTGCT TGGCTCCCAG AGTGAGAAAC TAGGCATTCA 
10301 TTTGTTCAAT CTTCAAACAT AGTCTATTTG AAAATACCTC TCCCCTATTG 
10351 ACACCCTAAT GTCTAAAACA CCACCATAAA CATTTTCATC CTCCTTTTGT 
10401 GCCCCCTATT AAGAAGCAAA CCTGTGAAGC TACTATCGTT TATCATCAGT 
10451 GTGAATGCAC TGAGATTAGT CAAGAACAAC TTTTTTTTTT TTTTCTTTCT
10501 TTTTTGAGAC GCAGTCTCGC TCTGTTGCCC AGGCTGGAGT GCAGTGGCAC
10551 AATCTCGGCT CACTGCAACC TCTGTCTCCC GGGTTCGAGC AATTCTCTGC
10601 CTCAGCCTCC CAAGTAGCTG GGATTACAGG CGCCCACCAC CATGCCCGGC
10651 TAATTTTTTT TGTATTTTTA GTAGAAACAA GGTTTCACCA TCTTGGCCAG
10701 GCTGGTCTTG AACTCCTGAC CTCGTGATCC ACCTGCATTG GCCTCCCAAA
10751 GTGCTGGGAT TACAGACATG AGCCACTGTG CCCGGCCATA TGTTTTTCTT
10801 AAGAGAGAAA GGAAAGAGCT GGAAGGCACG GGGTGGGAGG GCCTGAAGAA
10851 GAGCATAGGT TGGGTGGGGT GGGGCATGGA CTGATTTGGC CTCTTTGTCT
10901 TGATGCCAGG CCAGACCTGA GGGAGTGGGT ATGCTCTTGG GGAGTACACA 
10951 GGCAGTACCA TGCTGTCATT ATCTTTGCTT TTGrOTGQG GGTTTAGCCA 
11001 AAGATTTCAT CCGGCACTTG ATGGAGAAGG ACCCAGAGAA AAGATTCACC
11051 TGTGAGCAGG CCTTGCAGCA CCCATGGTGA GAATTCACAC AACCTGTGAG
11101 CTGGGGCGGG ATTTGGGGCC CTCAGGTCTG CTTCTGCCCT CATAGGCAAC 
11151 CCACCACATA ACCCCATCCT AGGATTGCAG GAGATACAGC TCTAGATAAG 
11201 AATATCCACC AGTCGGTGAG TGAGCAGATC AAGAAGAACT TTGCCAAGAG 
11251 CAAGTGGAAG GTGAGTCCAT ATCCCTAGTT CTGGTCCCAG CCTCCCCAGG 
11301 ACTCCTCCCC ATCCCTACCC AGGCTCAGCT TGCACAGCAC CTGGCATCAC 
11351 ACTGGGCACA CAGTAACTGC TTAGGGATCC TTACTGAAGG ACTTCATTCA 
11401 TTCACTCTTT CATTCAACAA ACACTCCCAA CACCTTCTCT ATTCCAGAGA 
11451 GGGTCCCTCA CCTCCAAGTC TAGAGGAAGA AGTCTGTAAT TCTTCAGGAG 
11501 GCATCTGATC CAGCCTATGG GGTCCGAGAA AGGTCATAAA AGTGGTGATG 
11551 ACCTGACAGA GCTGTCAGTT AAGTAGGAAT TAGTGAGGCA TAGCGGAATA 
11601 ATGTCTATAG CCATTCCGGG AAGTGCAAGT GCTAAGCCTG GCCAGACTGG 
11651 AGGGGCTGAG GGGACTGAGA GGCAGGAGCC CAATTTAGAG AAGCAGGTAA 
11701 GGGGCCAGGC CTCTTAGGGC CTCATATGCC ACAGAGGAGC ACCAACTTGA 
11751 TCCTGAGGGC ACTGAGGAGC CCCAGAAGAA TCTTAGGCAA GTATTTGCTG 
11801 CATAGAAAGG GCTCTCAGGG CCAGGCATGG TGGCTCACGC CTGTAATCCC 
11851 AGCACTTTGG GAGGCCGAGG TGGTTGGATC ACCTGAGGTC AGGAGTTCAA 
11901 GACCATCCTG GCCAACATGG CAAAACCCTG TCTCTACTAA AAATAAAAGA 
11951 ATTAGCCACA CATGGTGGTG CGTGCCTGTA ATCCCAGCTA CTTGGGAAGC 
12001 TGAGGCAGGA GAATTACTTG AACCTCGGAG ATGGGGGTTG CAGTGAGCTG 
12051 AGATCGCGCC ACTGCACTCC AGCCTGGGCA ACAAAGTGAG ACTCCACCTC 
12101 AAAAAAAAAA GAAAGAGCTC TCAGGATGCA GAGAATGGCA TGGAGTAAAG 
12151 ACTGGGTGAC GCATTAGGAG GCTGTGGCAG AGATACAGGC AGGAGATGGT 
12201 AAGGGTTTGG AACCACAGTA GCAGCAACAG GGGGCAGAGA ACAGTGGTTG 
12251 ATCCAGGAGT CATTTAGGAG GTGAAACTGA CAAGACATGA CGATGCAATG 
12301 GATGTTGGGG GAAAGAGATG TCAAGGGCTG GCCCAAGACT GTGGCTGGGA 
12351 ACAGAATGGA TGGTGGTGGT ACCATGACTG AGATGGTTAT CACAGGGACA 
12401 GAAACATGTT TTGGGGGGAT GGTTTTAGTT TTAGACATGG TGAATTTGAG 
12451 GGGTGTGTGG GACACCTAGG TGGAGATATT GAATAGAGAC ACACCTGAGC 
12501 AAGTTACTTC AGCTTTCTGT GCCTCAGTTT CCTCCTTTGA AAATGATAAT 
12551 AGTACCTACC TCAAAGACTT TCATGAAGAT TAAATGAATT ACTACGTAAA 
12601 GTGCTTAGAA CAGTGCCTGA CATACAGTGC TATAGTGTTT GCTATTACAT 
12651 ATTAATATGA ATTATAGTTA TGTTTCTATT TATATATATA GATACACATA 
12701 CATCTAACAT ATGTGCGTGT GTGTGTGTAA ATATATAATA AAGCCTTGTA
12751 GAGGTTTTTG GGGGGCTTTA GGGGAATTAA TAAAATAACT CCTGAATGAA 
12801 AATAACAGAA CAATTGCAAG AATCCCACTG CGCCCCTGCC CCATGACTTG 
12851 ACTCTCTCAA AAGTCCTTTC TCCCCTCTCC CTTCAATGCC TTCAATGCCA 
12901 GCAAGCCTTC AATGCCACGG CTGTGGTGCG GCACATGAGG AAACTGCAGC 
12951 TGGGCACCAG CCAGGAGGGG CAGGGGCAGA CGGCGAGCCA TGGGGAGCTG 
13001 CTGACACCAG TGGCTGGGGG TGAGGAGCGG GCTCTGCAGA AGGGCATGGG 
13051 TGGTCCACAA AGGTGCACCC GGGCTGGAGT GGAGGGCCTG CCCCTGCGGC 
13101 CACCTCTGTT CTGTCTTCCC ATGCAGGGCC GGCAGCTGGC TGTTGCTGTC 
13151 GAGACTGCTG CGTGGAGCCG GGCACAGAAC TGTCCCCCAC ACTGCCCCAC 
13201 CAGCTCTAGG GCCCTGGACC TCGGGTCATG ATCCTCTGCG TGGGAGGGCT 
13251 TGGGGGCAGC CTGCTCCCCT TCCCTCCCTG AACCGGGAGT TTCTCTGCCC 
13301 TGTCCCCTCC TCACCTGCTT CCCTACCACT CCTCACTGCA TTTTCCATAC
13351 AAATGTTTCT ATTTTATTGT TCCTTCTTGT AATAAAGGGA AGATAAAACC 
13401 ATCCTTAGCG CTGTCTCCCT CAATATCCCC CACCCCATCT TGTTGTGCAA 
13451 ACTGACTGCT TGATTTGGGG GTGCCTGGCC TTTGAGGTAG TCACAGGGAG 
13501 GCCCCTCCCC AACATGAGAC TGGGTGGGGA TGGGGAGAGA GAAGTGGGGA
13551 ATGGAGGGGA AGGTGCTTGG GGAATTTCTT TGTCCAGGGT GCCCCATCTA
13601 GCCTTCCGGC CCTTTGGAAC CCTTTCTGCG CTTTGCTGGT GGCTCCTGAG
13651 CATGGCGGGA TTGGCGCAGG TCGGCACTGA ACAGCACCTG TAGGAGGGTG 
13701 GAGTCTGTGT GGGGAGGAGG GTACACTGGG GTCAGGGCTG GTGAGACTAG 
13751 TGACAGTGTT GGGAGGTGGA AGAGTCCTTG GGGAACAGGG CCGAAGGCAA 
13801 TGAGAATCCA CTGGGGTTGG GACAGGGGTG GCTGGAGAGT CCTTTAGGGC 
13851 CACCTGGGGC GGTGGTGGAA GAGTCCACTG GGTCTGGGCT GGAGGAGAGG 
13901 AAACCTAGGG AGGACACCTA GGTACACTCA CCGCTTGGGC CCAGCCAGCA 
13951 TAAGGTCCCC ACAGGCTCCG GAAAAAGTTT CCTAAATCAG AAGTGATGAG 
14001 ACTAAGTTAT CTGACCCCTT CTGTGACCCA TCAACAGAAG TAGGGTCTGA 
14051 GGGAGAGGTG ACTAAGAGAG AGAGAAGTTT CTACCATCCC AGCCCACTGC 
14101 CAGCCCCTGC AGCCCACTTT CCTCACCCAG TTCCTTGTTG GTCTGGGGGC 
14151 TCGGTCCCTT CGCCTGGGAC GTGGTAGGGT GCCAGCTGTA GTCACGTTGG 
14201 GCAATGTGCC ACATATGGAC ATCCACGGGC ACAGCCTGGG GCTTGTCTAG 
14251 GGCCATCAGG CAGATGCAGT CAGCCACCTT TGACAGACAC AGAATGAGCC 
14301 CTTGTGGAAG AAGGGCAGCA TGTGGCCAGC ATCTTGCTTA TAGCCCCAAA 
14351 GCCGGCTGCT TTCTCCTTCA CTCTGGGGTT ACTGTTGTTC TATATTCTCA 
14401 ATCAACAGAT ACTATCTATG AATACACTTT TTTTTTGTTT GTTTTTGAGA
14451 TGGAGTCTCG CTCTGTTGCC TAGGCTGGAG TGCACTGGTG CAATCCTGGC
14501 TCTCCCAGGT TCAAGCAATT CTCCTACCTC AGCCTCCCAA GTAGCTGGGA
14551 TTACAGGCAT GTGCCACCAC GTGTGGCTAA TTTTTGTGTT TTTAGTAGAG
14601 ATGGGGTTTC ACCATGTTGG CCAGCCTGGT CTCGAACTCC TGACCTCAAG
14651 TGATCTGTCC ACCTTGGCCT CCCAAAGTGC TGGGATTACA GGCGTGAGCC
14701 ACCATGCGCG GCCTATGAAT ACACTGAAAT TGCTGTAATA AGAGGTGCTA
14751 CTAGCTGAAC ACCTATGTGG GCCAGGTTAT CATAACCTGG GAAGAAGGTA
14801 TTACCACACC CACTTTACAG ACAAGAAAAC TGAGGCTTTG AAAGGTGAAG
14851 TGACCTGGCC AAAGTCACAT GGCTGAGAAT AGGCAGAACC AAGATTTAAT
14901 GTTAGGCTGT AGTCCAAAGC CCATCAAAAA AAAATCTTTA AGCAAAAATT
14951 CATTTTTTAA ACTACAGAGA AGTATAAAGA AAAAAAAAGG CTGGGTGCAG
15001 TGGCTCACGC CTGTAATCCC AGCACTTTGG GAGGCTGAGG CAGGTGGATC
15051 TCGAGGTCAG ATTGAGACCA TCCTGGCCCA ACATGGTGAA ACCCCATCTC
15101 TACTAAAAAT ACAAAAATTA GCTGGGTGTG GTGGCGCATG CCTGTAATCC
15151 CAGCTAATCT GGAGGCTGAG GCAGGAGAAT AGCTTGAAGC CGGGAGGCGG
15201 AGGTTGCAGT GAGCCGAGAT TGCACCACTG CACTCCAGCC TGGCAACAAA
15251 GCAAGACTCC ACCTCAAAAA AAAAAAAAAA AAGACAAATG CCTAATTTCC
15301 AGTCATCTTA TTGCCAGTTA ACCCTATTGA CATCAAGCAA AAAGTTTTGT

15351 CAGTACATGT CATTTTACGA AAGGAACAAA ATGTGGCCGG GAGCAGTGGC (SEQ ID NO: 3) 

FEATURES: 

Genewise results: 

Start: 3000 

Exon: 3000-3082 

Exon: 7535-7609 

Exon: 7696-7834 

Exon: 8991-9117 

Exon: 9212-9287 

Exon: 9980-10092 

Exon: 10998-11076 

Exon: 11173-11260 

Exon: 12902-13019 

Exon: 13127-13206 

Stop: 13207 
sim4 results:

Exon: 3000-3082, (Transcript Position: 1-83)

Exon: 7535-7609, (Transcript Position: 84-158) 

Exon: 7696-7834, (Transcript Position: 159-297) 

Exon: 8991-9117, (Transcript Position: 298-424) 

Exon: 9212-9287, (Transcript Position: 425-500) 

Exon: 9980-10092, (Transcript Position: 501-613) 

Exon: 10998-11076, (Transcript Position: 614-692) 

Exon: 11173-11260, (Transcript Position: 693-780) 

Exon: 12902-13019, (Transcript Position: 781-898) 

Exon: 13127-13209, (Transcript Position: 899-981) 
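
The transcript positions in the sim4 listing follow from the genomic exon boundaries by accumulating exon lengths. The sketch below shows that bookkeeping; the names `EXONS`, `transcript_positions` and `intron_lengths` are hypothetical, and the first exon is assumed to start at 3000, as in the Genewise listing.

```python
# Genomic exon boundaries (1-based, inclusive) from the sim4 listing.
# The first exon start is assumed to be 3000, as in the Genewise results.
EXONS = [
    (3000, 3082), (7535, 7609), (7696, 7834), (8991, 9117), (9212, 9287),
    (9980, 10092), (10998, 11076), (11173, 11260), (12902, 13019),
    (13127, 13209),
]


def transcript_positions(exons):
    """Map each genomic exon to its 1-based span in the spliced transcript."""
    spans = []
    offset = 0
    for start, end in exons:
        length = end - start + 1
        spans.append((offset + 1, offset + length))
        offset += length
    return spans


def intron_lengths(exons):
    """Length of each intron lying between consecutive exons."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]


spans = transcript_positions(EXONS)
introns = intron_lengths(EXONS)
for (g_start, g_end), (t_start, t_end) in zip(EXONS, spans):
    print(f"Exon: {g_start}-{g_end}, (Transcript Position: {t_start}-{t_end})")
```

The final transcript coordinate produced this way equals the length of the cDNA in SEQ ID NO:1, including the stop codon.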



CHROMOSOME MAP POSITION: 

chromosome 3 



ALLELIC VARIANTS (SNPs):

DNA
Position    Major    Minor    Domain
1892        A        G        Intron
3351        G        A        Intron
8636        T        A        Intron
8805        T        C        Intron
9802        T        A        Intron
9833        C        G        Intron
11352       C        T        Intron
13319       T        C A G    Intron
13659       C        G        Intron
14292       G        C        Intron



Context: 



DNA 

Position 

1892 GGTGTTCGGGTGCAGGGCTGTCTGGGGCACTGTGTGGTGT^ 
GGGAGTACATGTATGATCAGGTGTCACGGGATCT 
GGCAGGTGTTTGAGTTCAGGGCTGTGGAGGGGGCTTGGTGTGGCATCT 
TGTGTGTGGATCTGTGAGGGTTGT ATTTGGTAGGCCTCCATGTGGGTTTCAGACTCTGCC 
TCTAGAGCTTACACTCGAGTCTCCTTTCCT 
[A,G] 

GTCCCCTGGGAAAMGGTCCTGTTCL^GGAGTGG 

GGCACCITCTCCTCATTCTCCCTTAGAG 

TCACAMGCATGCTGGGCTCrGTTTTGGCCTCATCTGT 

CTCTGAATGGGGCCCATTCTGGCTTCATATTGGMCT 

CTTTGCCCTCAGCAAGCTTAGCTGGGGGCCCCAGGCCAGGTGTCA™ 

3351 GACATCTACGALTTCCGAGATGTTL^^ 

ATGGCTGAGGGAGGCTGAGTCCAGGGTGGGGCTTCCTCTGGTCMTTMTGCTTCCT 
TCCO^CAGCCOVGGCCCTGTGG^ 

CTTCTGGATGA(^GTGGGTGGGACACGGGCrGCTCrCCCAATAGCCCTGGGI I LI IGAAG 

AGAMGMGTCGAGAGMTGMGGTGCCAGTCAGTCCATTTMCTTGCTG^ 

[G,A] 

GTGTTCTAGCCTAGGTTTGGGMCTGaGGCTGGAGATGGCTCTG I I LI IGGTGCTGGGAA 
TGCAGAAATAACTCAAACCTGGTCTCTGCCCTTCAAGTTGATCCCAGACATGTGCAAGAG
ACAGACCrA^GAAMTGACMCAGCXJTGrGTGCTGTGCrCCMTTAAGGTTGGGATTGA 
GGGCTITGrGGAC^CCAGAGAGAGCrGrGCCiTCrGCCrGGGGGAAMCTrCCTGGAGM 
TGGGGCATTAGAGCrGGGGACrGMGGATGGGTAGGTCTGG^CrrGTGVGAGAGGMGM 

AGAGCGAGCCTGGACGA(^GAGCGAGA(TCCATGrOVWWW^TAAMTAAAMaW\A 

MTCCTATTCCCCTTCTGTAGAAMGTGGATGGGACAGGW\AGW 

AGAMTCCCCGAMTCCTACTCCrCGGAAATAGCGACGGGGCTCACATTTAGCAGTACAT 

(TCMTCCGTTCTAGGAGMGGGCACTTGGQGTGrGACATGCCTGGT^ 

TCTGCTACTGCCrAACTGTGGb I I LI I GGGTGAGTCACTTTGCCrCCAAAGGCATCAGTT 

[T,A] 

CCTC^TCTGTTAGGTGAGATTATACIAGA 
TAMTCMGG^CTMTCCAGGGTCTGGCATAAMTAG^ 

TTTACAGTGCACACCTGAGGTITAGAGACAGrrCCCCCCCACACCCrCrrGAGCCTTCTC 
CTTCCTQGAAI I I I IGGCCI ICI I GAGAGCTTCCTTGA I I I IU I ATGACAGCCATGAAG 
CCACAGTGGCTITTGGGGATCCATTATTTCrCAGAAGGTGC^ 

TAGCAGTACATCTCMTCCGTTCrAGGAGMGGGCACrTGGGGTGTG^ 

TGMTTCr GGCTC TGCTACTGCCT AACTGTGGG I ICI I GGGTGAGTCACTTTGCCTCCAA 

AGGCATCAGTTTCCTC^TCrGTTAGGTGAGATTATACAGACTGGCCTAG 

TGAGGATGGCATTAMTCMGCACTMTCCAGGGTCTGGG\TAAAA^ 

TTCCTTTAGGGCTrTACAGTGCAGVCCrGAGGTTTAGAGACA 

[T,C] 

TGAGCCTTGTCCTTCCTGGA A I I I I IGGCCI ICI I GAGAGCTTCCTTGA I I I ICI IATGA 

CAGCCATGMGCCACACTGGCTTTTGGGGATC 

GCAGMGGTTCTACCAGCCTCTMCCATCTCTGATTGCCCCTTCT 

TCMGCCAGAGAATCTGCTGTAOACAGCCTGGATGMGACTCCAAMTOT 

ACTTTGGCCTCTCCMGATGGAGGACCCGGGCAGTGTGCTCrCCACCGCCTCT 

AGACAAAGTGCCTGCCCTCAGGGAGCTGACI I ICI 1 1 CTAGTAGGGAAGACAGACAATCA 

ACAAGT AMTAMTCTAC^CTGACGTCAGGTGATAAAMTAMTACTGTGGAGAAAM 

CCAAGCAGGMTAGGGAGACGGGGTGATGCCATITCAGT AGGGAGGTCAGGGAAGGGCTC 

GCTGTGGAGGTGATGACCGAGTGGTGAGGGAGCCAG^ 

GGCATAGGCAGMGCMTGGIMGTCXIAAAGG^ 

[T,A] 

CMGGAACAGAAAGGATAATGTAGCT AGMG^GGAGTGAGC^GGCAGGGCTGGTAGAGTT 
TATAMGGGGGMCTCCTTC<^TGGCTCCTGCCrGACCCCTGAGACT^ 
ACCCCGGAGCGV\CGGCACCCGAMGTGGAMTGAGGATGAGTTTCTCCCrGCCG^GGCT 
CTGCGJITACCCTCCCTTCTATGAC^ 

GGCCGACTACGAGTTTGACTCTCCTTACTG 

TTCTTTCT AGTAGGGMGAO^GACAATCAACMGTAM 
GTGATAAAMTAMTACTGTGGAGAAAAACCAAGC7\GGAATAGGGAGACG 
ATTTO\GTAGGGAGGTCAGGGMGGGCTCGCTGTGGAGGTGATGACCGAGTGGTG^ 
GCCAGACATTGGAGCTCTGGGGAMGACTGGC7VTAG 

GCCCTGAGGAGGGCMGATGGCGGCACATACMG AGAA 
[C,G] 

AGGAGTGAGCAGGCAGGGCTGGTAGAGTTTATAMGGGGGMCTCCTTCCA^ 
CCTGACCCCTGAGACTGCCC(^GTG(TCCACCCCGGAGC(^CGGG\CCCGAMGTGGAA 
ATGAGGATGAGTTTCTCCCTG CCOVGG CTCTGCGGTTA^ 
GATGCCAAACTCTTTGAACAGATTTTGAAGGCCGAGTACGAGTTTG^ 
GACGAC^TCTCTGACTCTGGTATTTGGGGCTITGC I I I I I I CCCCTGGGCCCTGCCTCTG 

GTGAGCAGGCCTTC^GCACCG^TGGTG^ 
TTTGGGGCCCTCAGGTCTGCTTCTGCCCTCATAGGC^

GGATTGGkGGAGATA^GCrCTAGATMGMTATCCACCAGTCGGrGAGTGAGCAGATCA 

AGMGMGTTGCCMGAGCMCTGGMGGTGAGTCCATATCCCrAGTTCT 

CTCCCCAGGACTCCTCCCCATCCCTACCCAGGCTCAGCTTGC^^ 

[C,T] 

TGGGCACACAGTAACTGCTTAGGGATCCTTACTGMGGA CA 
TTCMCAMCACTCCCMCACCXrCTCTATTCC^ 

GAGGMGMCTCTGTMTTCTTCAGGAGGG^TCTGATCOXGCCrATGGGGTCCG^GAAAG 
GTCATAAAAGTGGTGATGACCTGACAGAGCT^ 

GCGGMTMTGTCTATAGCCATTCCGGGMGTGCAAGTGCrAAGCCTGGCG^GACTGGAG 

13319 GGTGAGGAGCGGGCTCTGCAGMGGGCATGGGTGGTCCACA 

GTGGAGGGCCTGCCCCTGCGGCG^CCTCTGTTCTGTCTTCCCATG 

GCTGTTGCPGT CGAGACTGCrGCGTGGAGCCGGGCACAGAACTGrCCCCCACACTGCCCC 

ACCAGCTCT AGGGCCCTGGACCT CGGGTCATGATCCTCTGCGTGGGAGGGCTTGGGGGCA 

GCCTGCTCCCCTTCCCrCCCTGMCCGQGAGTTTCTCTGCCCTCT 

[T,C,A,G] 

TCCCTACCACTCCTCACTGCATTTTCCATACAMTGTT^ 
TMTAMGGGMGATAAMCCATCOTAGCGCTGTCTCCCr<^ 
TTGTTGTGCAMCTGACTGCITGATTTGGGGGTGCCTG 
GGCCCCTCCCCAACATGAGACTGGGTGGGGATGGGGAGAGAGAAGTGGGGMTGGA 

H AAGGTGCTTGGGGAAI 1 1 LI 1 1 GTCCAGGGTGCCCCATCTAGCGTCCGGCCGTTGGAA 


q 13659 CTATTTTATTGTTCCrrCTTGTAAT^ AGCGCTGTCTCC 
} CTCAATATCCCCCACCCCATCTT GTTGTGCAMCTGACTGC1TGATTTGGGGGTGCCTGG 
'm CCTTTGAGGTAGTC^CAGGGAGGCCCCTCCCCAACATGAGACTGGCT 
g GAGMGTGGGGAATGGAGGGGAAGGTGCTTGGGGAA 1 1 1 CI 1 1 GTCCAGGGTGCCCCATC 

g TAGCCTTCCGGCCCrrrGGMCCCTTTCTGCGCTTTGCTGGTGGCT 

ik ATTGGCGCAGGTCGGCACTGMCAGCACCTGTAGGAGGGTGGAGTCTGTGTGGGGAGGAG 

L GGTACACTGGGGTCAGGGCTGGTGAGACTAGTGACAGTGTTGGGAGGTGGMGAGTCCTT 

H GG GGMC AGGGCCGMGGCMTGAGMTCCACTGGGGTTGGGACAGGGGTGGCT 

2 TCCTTTAGGGCCACCTGGGGCGGTGGTGGMGAGTCCACTGGGTCTGGGCTG^ 

W GAMCCTAGGGAGGACACCTAGGTACACTCACCGCTTGGGCCCAGCCAGCATAAGGTCCC 


Q 14292 AGTGATGAGACTMGTTATCTGACCCCTTCTGTGACCCATCM 
fU GGAGA GCTGA CTMGAGAGAGAGMGTTTCTACCATCCCAGCC 

GCCCACTTTCCTCACCCAGTTCCTTGTTGCT 

TGGTAGGGTGCCAGCTGTAGTCACGTTGGGCMTGTGC 

CAGCCTGGGGCTTGTCTAGGGCCATCAGGCAGATGCAGTCAGCC^^ 

[G,C] 

MTGAGCCCTTGTGGMGMGGGCAGCATGTGGCCAGCATCTTGCTTATAGCCCCAAAGC 
CGGCTGCTTTCTCC TTCACTCTGGG GTTACTGTTGTTCT 

TATCTATGAATACAU I I I 1 1 1 1 IGI I IGI 1 1 1 I GAGATGGAGTCTCGCTCTGTTGCCTA 

GGCTGGAGTGCACTGGTGCMTCCTGGCTCTCCCA 

CCTCCCMGTAGCTGGGATTACAGGCATGTGCCACCACGTGTG^ 



FIGURE 31 



